Discovering Hidden Relationships via Efficient Co-clustering of Sparse Matrices

نویسندگان

  • Brian Thompson
  • Linda Ness
  • David Shallcross
  • Devasis Bassu
چکیده

We present a new approach for discovering hidden relationships in sparse bipartite data by identifying large, dense biclusters in the corresponding matrix. We motivate a new class of metrics to measure the quality of a bicluster partition, and compare them to existing metrics. We then present a heuristic algorithm that efficiently searches the space of possible co-clusterings for one which maximizes the value of a given metric. We evaluate our approach with experiments on synthetic and real-world datasets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Feature Enriched Nonparametric Bayesian Co-clustering

Co-clustering has emerged as an important technique for mining relational data, especially when data are sparse and high-dimensional. Coclustering simultaneously groups the different kinds of objects involved in a relation. Most co-clustering techniques typically only leverage the entries of the given contingency matrix to perform the two-way clustering. As a consequence, they cannot predict th...

متن کامل

Coronavirus: Discover the Structure of Global Knowledge, Hidden Patterns & Emerging Events

Background & Objective:  The present study aimed at exploring the structure of global knowledge, hidden patterns, and emerging Coronavirus events using co-word techniques. Co-word analysis is one of the most efficient scientific methods to analyze the structure and dynamics of knowledge and the general state of research.  Materials & Methods:  This applied research performed using Co-word anal...

متن کامل

Image Classification via Sparse Representation and Subspace Alignment

Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...

متن کامل

Data mining using the crossing minimization paradigm

Our ability and capacity to generate, record and store multi-dimensional, apparently unstructured data is increasing rapidly, while the cost of data storage is going down. The data recorded is not perfect, as noise gets introduced in it from different sources. Some of the basic forms of noise are incorrect recording of values and missing values. The formal study of discovering useful hidden inf...

متن کامل

A Survey on Novel Graph Based Clustering and Visualization Using Data Mining Algorithm

As the amount of data stored by various information systems grows very fast, there is an urgent need for a new generation of computational techniques and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volume of data. Data mining (DM) is one of the most useful methods for exploring large data sets. Clustering, as a special area of data mining, is one...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013